Symposium on Theoretical Aspects of Computer Science 2010 (Nancy, France), pp. 501-512 
www.stacs-conf.org 



COLLAPSIBLE PUSHDOWN GRAPHS OF LEVEL 2 ARE TREE-AUTOMATIC



ALEXANDER KARTZOW



TU Darmstadt, Fachbereich Mathematik, Schlossgartenstr. 7, 64289 Darmstadt, Germany 



Abstract. We show that graphs generated by collapsible pushdown systems of level 2 
are tree-automatic. Even when we allow $\varepsilon$-contractions and add a reachability predicate
(with regular constraints) for pairs of configurations, the structures remain tree-automatic. 
Hence, their FO theories are decidable, even when expanded by a reachability predicate. 
As a corollary, we obtain the tree-automaticity of the second level of the Caucal-hierarchy. 



1. Introduction 

Higher-order pushdown systems were first introduced by Maslov [10, 11] as accepting
devices for word languages. Later, Knapik et al. [8] studied them as generators for trees. 
They obtained an equi-expressivity result for higher-order pushdown systems and for higher- 
order recursion schemes that satisfy the constraint of safety, which is a rather unnatural 
syntactic condition. Recently, Hague et al. [6] introduced collapsible pushdown systems as 
extensions of higher-order pushdown systems and proved that these have exactly the same 
power as higher-order recursion schemes as methods for generating trees. 

Both - higher-order and collapsible pushdown systems - also form interesting devices 
for generating graphs. Carayol and Wöhrle [3] showed that the graphs generated by
higher-order pushdown systems of level $l$ coincide with the graphs in the $l$-th level of the
Caucal-hierarchy, a class of graphs introduced by Caucal [4]. Every level of this hierarchy is
obtained from the preceding level by applying graph unfoldings and MSO interpretations. 
Both operations preserve the decidability of the MSO theory whence the Caucal-hierarchy 
forms a rather large class of graphs with decidable MSO theories. If we use collapsible 
pushdown systems as generators for graphs we obtain a different situation. Hague et al. 
showed that even the second level of the hierarchy contains a graph with undecidable MSO 
theory. But they showed the decidability of the modal $\mu$-calculus theories of all graphs in the
hierarchy. This turns graphs generated by collapsible pushdown systems into an interesting 
class from a model theoretic point of view. There are few natural classes that share these 
properties. In fact, the author only knows one further example, viz. nested pushdown
trees. Alur et al. [1] introduced these graphs for $\mu$-calculus model checking purposes. We
proved in [7] that nested pushdown trees also have decidable first-order theories. We gave an
effective model checking algorithm using pumping techniques, but we also proved that nested 
pushdown trees are tree-automatic structures. Tree-automatic structures were introduced 
by Blumensath [2]. These structures enjoy decidable first-order theories due to the good 
closure properties of finite automata on trees. 

In this paper, we are going to extend our previous result to the second level of the 
collapsible pushdown hierarchy. All graphs of the second level are tree-automatic. This 
subsumes our previous result as nested pushdown trees are first-order interpretable in
collapsible pushdown graphs of level two. Furthermore, we show that collapsible pushdown
graphs of level 2 are still tree-automatic when expanded by a reachability predicate, i.e., 
by the binary relation which contains all pairs of configurations such that there is a path 
from the first to the second configuration. Thus, first-order logic extended by reachability 
predicates is decidable on level 2 collapsible pushdown graphs. 

In the next section, we introduce the necessary notions concerning tree-automaticity 
and in Section 3 we define collapsible pushdown graphs. We explain the translation of
configurations into trees in Section 4. Section 5 is a sketch of the proof that this translation
yields tree-automatic representations of collapsible pushdown graphs, even when enriched 
with certain regular reachability predicates. The last section contains some concluding 
remarks about questions arising from our result. 



2. Preliminaries 

We write MSO for monadic second order logic and FO for first-order logic. For words
$w_1, w_2 \in \Sigma^*$, we write $w_1 \sqcap w_2$ for the greatest common prefix of $w_1$ and $w_2$. A $\Sigma$-labelled
tree is a function $T : D \to \Sigma$ for a finite $D \subseteq \{0, 1\}^*$ which is closed under prefixes.
For $d \in D$ we denote by $T_d$ the subtree rooted at $d$.

Sometimes it is useful to define trees inductively by describing their left and right 
subtrees. For this purpose we fix the following notation. Let $\hat{T}_0$ and $\hat{T}_1$ be $\Sigma$-labelled trees
and $\sigma \in \Sigma$. Then we write $T := \sigma(\hat{T}_0, \hat{T}_1)$ for the $\Sigma$-labelled tree $T$ with the following three
properties:

1. $T(\varepsilon) = \sigma$, 2. $T_0 = \hat{T}_0$, and 3. $T_1 = \hat{T}_1$.
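The tree notation above can be made concrete with a small sketch. It assumes (this representation is not from the paper) that a tree is stored as a Python dict mapping node addresses, written as strings over "0" and "1", to labels; the helper names `is_tree`, `subtree`, `node` and `common_prefix` are hypothetical.

```python
# Assumed representation: a Sigma-labelled tree is a dict from node
# addresses (strings over "0"/"1", "" is the root) to labels.

def is_tree(t):
    """Check that the domain is closed under prefixes."""
    return all(d[:-1] in t for d in t if d)

def subtree(t, d):
    """T_d: the subtree rooted at d, re-addressed so that d becomes the root."""
    return {e[len(d):]: lab for e, lab in t.items() if e.startswith(d)}

def node(label, left=None, right=None):
    """sigma(T0, T1): a new root whose left and right subtrees are T0 and T1."""
    t = {"": label}
    for bit, sub in (("0", left), ("1", right)):
        if sub:  # None or {} stands for the empty tree
            t.update({bit + d: lab for d, lab in sub.items()})
    return t

def common_prefix(w1, w2):
    """Greatest common prefix of two words."""
    n = 0
    while n < min(len(w1), len(w2)) and w1[n] == w2[n]:
        n += 1
    return w1[:n]

t = node("a", node("b"), node("c", None, node("d")))
```

With this representation, properties 2 and 3 of the inductive definition read `subtree(t, "0") == left` and `subtree(t, "1") == right`.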

In the rest of this section, we briefly present the notion of a tree-automatic structure 
as introduced by Blumensath [2]. 

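The convolution of two $\Sigma$-labelled trees $T$ and $T'$ is given by a function

$$T \otimes T' : \mathrm{dom}(T) \cup \mathrm{dom}(T') \to (\Sigma \cup \{\Box\})^2$$

where $\Box$ is a new symbol for padding and

$$(T \otimes T')(d) := \begin{cases} (T(d), T'(d)) & \text{if } d \in \mathrm{dom}(T) \cap \mathrm{dom}(T'), \\ (T(d), \Box) & \text{if } d \in \mathrm{dom}(T) \setminus \mathrm{dom}(T'), \\ (\Box, T'(d)) & \text{if } d \in \mathrm{dom}(T') \setminus \mathrm{dom}(T). \end{cases}$$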
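As an illustration of the convolution, the following minimal sketch overlays two trees and pads missing positions. The dict-based tree representation (node address to label) and the name `convolution` are assumptions made for this example only.

```python
PAD = "□"  # padding symbol, assumed not to occur as a label

def convolution(t1, t2):
    """Overlay two trees (dicts: address -> label), padding missing nodes."""
    return {d: (t1.get(d, PAD), t2.get(d, PAD)) for d in set(t1) | set(t2)}

left = {"": "a", "0": "b"}
right = {"": "c", "1": "d"}
both = convolution(left, right)
```

Here the root carries the pair of both root labels, while node "0" exists only in the first tree and node "1" only in the second, so each is padded on the other component.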

By "tree-automaton" we mean a nondeterministic finite automaton that labels a finite tree
top-down.

Definition 2.1. A structure $\mathfrak{B} = (B, E_1, E_2, \ldots, E_n)$ with domain $B$ and binary
relations $E_i$ is tree-automatic if there are tree-automata $A_B, A_{E_1}, A_{E_2}, \ldots, A_{E_n}$ and a bijection
$f : L \to B$ for $L$ the language accepted by $A_B$ such that the following holds. For $T, T' \in L$,
the automaton $A_{E_i}$ accepts $T \otimes T'$ if and only if $(f(T), f(T')) \in E_i$.

Tree-automatic structures form a nice class because automata theoretic techniques may 
be used to decide first-order formulas on these structures: 

Lemma 2.2 ([2]). If $\mathfrak{B}$ is tree-automatic, then its first-order theory is decidable.

We will use the classical result that regular sets of trees are MSO definable. 

Theorem 2.3 ([12], [5]). For a set $\mathcal{T}$ of finite $\Sigma$-labelled trees, there is a tree automaton
recognising $\mathcal{T}$ if and only if $\mathcal{T}$ is MSO definable.



3. Definition of Collapsible Pushdown Graphs (CPG) 

In this section we define our notation of collapsible pushdown systems. For a more 
comprehensive introduction, we refer the reader to [6]. 

3.1. Collapsible Pushdown Stacks 

First, we provide some terminology concerning stacks of (collapsible) higher-order
pushdown systems. We write $\Sigma^{*2}$ for $(\Sigma^*)^*$ and $\Sigma^{+2}$ for $(\Sigma^+)^+$. We call an $s \in \Sigma^{*2}$ a 2-word.

Let us fix a 2-word $s \in \Sigma^{*2}$ which consists of an ordered list $w_1, w_2, \ldots, w_m \in \Sigma^*$. We
separate the words of this list by colons, writing $s = w_1 : w_2 : \cdots : w_m$. By $|s|$ we denote
the number of words $s$ consists of, i.e., $|s| = m$.

For another word $s' = w'_1 : w'_2 : \cdots : w'_n \in \Sigma^{*2}$, we write $s : s'$ for the concatenation
$w_1 : w_2 : \cdots : w_m : w'_1 : w'_2 : \cdots : w'_n$.

If $w \in \Sigma^*$, we write $[w]$ for the 2-word that consists of a list of one word which is $w$.

A level 2 collapsible pushdown stack is a special element of $(\Sigma \times \{1,2\} \times \mathbb{N})^{+2}$ that
is generated by certain stack operations from an initial stack which we introduce in the
following definitions. The natural numbers following the stack symbol represent the so-called
collapse pointer: every element in a collapsible pushdown stack has a pointer to some
substack and applying the collapse operation returns the substack to which the topmost
symbol of the stack points. Here, the first number denotes the collapse level. If it is 1, the
collapse pointer always points to the symbol below the topmost symbol and the collapse
operation just removes the topmost symbol. The more interesting case is when the collapse
level of the topmost symbol of the stack $s$ is 2. Then the stack obtained by the collapse
contains the first $n$ words of $s$ where $n$ is the second number in the topmost element of $s$.

The initial level 1 stack is $\bot_1 := (\bot, 1, 0)$ and the initial level 2 stack is $\bot_2 := [\bot_1]$.

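For $k \in \{1, 2\}$ and for a 2-word $s = w_1 : w_2 : \cdots : w_n \in (\Sigma \times \{1,2\} \times \mathbb{N})^{+2}$ such that
$w_n = a_1 a_2 \ldots a_m$ with $a_i \in \Sigma \times \{1,2\} \times \mathbb{N}$ for all $1 \le i \le m$:

• we define the topmost $(k-1)$-word of $s$ as $\mathrm{top}_k(s) := w_n$ if $k = 2$ and $\mathrm{top}_k(s) := a_m$ if $k = 1$;

• for $\mathrm{top}_1(s) = (\sigma, i, j) \in \Sigma \times \{1,2\} \times \mathbb{N}$, we define the topmost symbol $\mathrm{Sym}(s) := \sigma$,
the collapse-level of the topmost element $\mathrm{CLvl}(s) := i$, and the collapse-link of the
topmost element $\mathrm{CLnk}(s) := j$.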
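For $s$, $w_n$ and $k$ as before, $\sigma \in \Sigma \setminus \{\bot\}$, and $w'_n := a_1 \ldots a_{m-1}$, we define the stack operations

$$\mathrm{pop}_k(s) := \begin{cases} w_1 : w_2 : \cdots : w_{n-1} & \text{if } k = 2,\ n \ge 2, \\ w_1 : w_2 : \cdots : w_{n-1} : w'_n & \text{if } k = 1,\ m \ge 2, \\ \text{undefined} & \text{otherwise,} \end{cases}$$

$$\mathrm{clone}_2(s) := w_1 : w_2 : \cdots : w_{n-1} : w_n : w_n,$$

$$\mathrm{push}_{\sigma,k}(s) := \begin{cases} w_1 : w_2 : \cdots : w_n (\sigma, 2, n-1) & \text{if } k = 2, \\ w_1 : w_2 : \cdots : w_n (\sigma, 1, m) & \text{if } k = 1, \end{cases}$$

$$\mathrm{collapse}(s) := \begin{cases} w_1 : w_2 : \cdots : w_r & \text{if } \mathrm{CLvl}(s) = 2,\ \mathrm{CLnk}(s) = r > 0, \\ \mathrm{pop}_1(s) & \text{if } \mathrm{CLvl}(s) = 1, \\ \text{undefined} & \text{otherwise.} \end{cases}$$

The set of level 2 operations is $OP := \{\mathrm{push}_{\sigma,1}, \mathrm{push}_{\sigma,2}, \mathrm{clone}_2, \mathrm{pop}_1, \mathrm{pop}_2, \mathrm{collapse}\}$. The
set of level 2 stacks, $\mathrm{Stck}(\Sigma)$, is the smallest set that contains $\bot_2$ and is closed under all
operations from $OP$.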

Note that collapse- and $\mathrm{pop}_k$-operations are only allowed if the resulting stack is in
$(\Sigma \times \{1,2\} \times \mathbb{N})^{+2}$. This avoids the special treatment of empty words or stacks. Furthermore, a collapse
on level 2 summarises a non-empty sequence of $\mathrm{pop}_2$-operations. For example, starting from
$\bot_2$, we can apply a $\mathrm{clone}_2$, a $\mathrm{push}_{\sigma,2}$, a $\mathrm{clone}_2$, and finally a collapse. This sequence first
creates a level 2 stack that contains 3 words and then performs the collapse and ends in the
initial stack again. This example shows that $\mathrm{clone}_2$-operations are responsible for the fact
that collapse-operations on level 2 may remove more than one word from the stack.
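The stack operations and the example sequence above can be replayed with a short sketch. It assumes (not from the paper) that a stack is a tuple of words, each word a tuple of triples `(symbol, collapse_level, collapse_link)`, and that `None` stands for "undefined"; the function names are hypothetical, and the requirement that pushed symbols differ from the bottom symbol is not enforced.

```python
BOTTOM = ("⊥", 1, 0)
INITIAL = ((BOTTOM,),)  # the initial level 2 stack

def clone2(s):
    """Duplicate the topmost word."""
    return s + (s[-1],)

def push(s, sym, k):
    """push_{sym,k}: the link is n-1 for level 2 and m for level 1."""
    n, m = len(s), len(s[-1])
    link = n - 1 if k == 2 else m
    return s[:-1] + (s[-1] + ((sym, k, link),),)

def pop(s, k):
    """pop_k, or None if the result would contain an empty word or stack."""
    if k == 2:
        return s[:-1] if len(s) >= 2 else None
    return s[:-1] + (s[-1][:-1],) if len(s[-1]) >= 2 else None

def collapse(s):
    """Follow the collapse link of the topmost symbol."""
    _, level, link = s[-1][-1]
    if level == 1:
        return pop(s, 1)
    return s[:link] if link > 0 else None

s1 = clone2(INITIAL)
s2 = push(s1, "a", 2)
s3 = clone2(s2)
s4 = collapse(s3)
```

In this run `s3` has three words, and the collapse removes two of them at once because the cloned topmost symbol still carries the link 1 created by the push.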

For $s, s' \in \mathrm{Stck}(\Sigma)$, we call $s'$ a substack of $s$ if there are $n_1, n_2 \in \mathbb{N}$ such that
$s' = \mathrm{pop}_1^{n_1}(\mathrm{pop}_2^{n_2}(s))$. We write $s' \le s$ if $s'$ is a substack of $s$.



3.2. Collapsible Pushdown Systems and Collapsible Pushdown Graphs 

Now we introduce collapsible pushdown systems and graphs (of level 2) which are 
analogues of pushdown systems and pushdown graphs using collapsible pushdown stacks 
instead of ordinary stacks. 

Definition 3.1. A collapsible pushdown system of level 2 (CPS) is a tuple $S = (\Sigma, Q, \Delta, q_0)$
where $\Sigma$ is a finite stack alphabet with $\bot \in \Sigma$, $Q$ a finite set of states, $q_0 \in Q$ the initial
state, and $\Delta \subseteq Q \times \Sigma \times Q \times OP$ the transition relation.

For $q \in Q$ and $s \in \mathrm{Stck}(\Sigma)$ the pair $(q, s)$ is called a configuration. We define labelled
transitions on pairs of configurations by setting $(q_1, s) \vdash^{(q_2, op)} (q_2, t)$ if there is a
$(q_1, \sigma, q_2, op) \in \Delta$ such that $\mathrm{Sym}(s) = \sigma$ and $op(s) = t$. The union of the labelled transition
relations is denoted as $\vdash := \bigcup_{\ell \in Q \times OP} \vdash^{\ell}$. We set $C(S)$ to be the set of all configurations
that are reachable from $(q_0, \bot_2)$ via $\vdash$-paths. We call $C(S)$ the set of reachable or valid
configurations. The collapsible pushdown graph (CPG) generated by $S$ is

$$\mathrm{CPG}(S) := \left(C(S), (C(S)^2 \cap \vdash^{\ell})_{\ell \in Q \times OP}\right)$$

Example 3.2. The following example of a collapsible pushdown graph of level 2 is taken
from [6]. Let $Q := \{0, 1, 2\}$, $\Sigma := \{\bot, a\}$, and $\Delta$ given by $(0, *, 1, \mathrm{clone}_2)$, $(1, *, 0, \mathrm{push}_{a,2})$,
$(1, *, 2, \mathrm{push}_{a,2})$, $(2, a, 2, \mathrm{pop}_1)$, and $(2, a, 0, \mathrm{collapse})$, where $*$ denotes any letter in $\Sigma$. In our
picture (see Figure 1), the labels are abbreviated as follows: $cl := (1, \mathrm{clone}_2)$, $a := (0, \mathrm{push}_{a,2})$,
$a' := (2, \mathrm{push}_{a,2})$, $p := (2, \mathrm{pop}_1)$, and $co := (0, \mathrm{collapse})$.



Figure 1: Example of a collapsible pushdown graph
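To make the system of Example 3.2 tangible, the following sketch explores its configuration graph breadth-first up to a bounded number of steps (the full graph is infinite). The stack representation (tuples of words of `(symbol, level, link)` triples), the operation names as strings, and the helpers `apply_op`, `successors`, `explore` and `show` are assumptions made for this illustration.

```python
from collections import deque

BOTTOM = ("⊥", 1, 0)
INITIAL = ((BOTTOM,),)  # the initial level 2 stack

def apply_op(op, s):
    """Apply a stack operation; None means 'undefined'."""
    _, level, link = s[-1][-1]
    if op == "clone2":
        return s + (s[-1],)
    if op == "push_a_2":
        return s[:-1] + (s[-1] + (("a", 2, len(s) - 1),),)
    if op == "pop1" or (op == "collapse" and level == 1):
        return s[:-1] + (s[-1][:-1],) if len(s[-1]) >= 2 else None
    if op == "collapse":
        return s[:link] if link > 0 else None
    raise ValueError(op)

# Transitions (state, symbol, new state, operation); None matches any symbol.
DELTA = [(0, None, 1, "clone2"), (1, None, 0, "push_a_2"),
         (1, None, 2, "push_a_2"), (2, "a", 2, "pop1"), (2, "a", 0, "collapse")]

def successors(config):
    q, s = config
    for q1, sym, q2, op in DELTA:
        if q1 == q and sym in (None, s[-1][-1][0]):
            t = apply_op(op, s)
            if t is not None:
                yield (q2, op), (q2, t)

def explore(max_steps):
    """Configurations and labelled edges found within max_steps transitions."""
    start = (0, INITIAL)
    seen, edges, frontier = {start}, set(), deque([(start, 0)])
    while frontier:
        config, depth = frontier.popleft()
        if depth == max_steps:
            continue
        for label, nxt in successors(config):
            edges.add((config, label, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen, edges

def show(s):
    """Render a stack by its symbols only, words separated by colons."""
    return ":".join("".join(a[0] for a in w) for w in s)

seen, edges = explore(3)
```

Within three steps the exploration already finds a collapse edge leading back to the initial configuration, which is the kind of back edge that distinguishes this graph from an ordinary higher-order pushdown graph.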

Remark 3.3. Hague et al. [6] showed that modal $\mu$-calculus model checking on level $n$ CPG
is n-EXPTIME complete. Note that there is an MSO interpretation which turns the graph 
of the previous example into a grid-like structure. Hence its MSO theory is undecidable. 

The next definition introduces runs of collapsible pushdown systems. 
Definition 3.4. Let $S$ be a CPS. A run $r$ of $S$ of length $n$ is a function

$$r : \{0, 1, 2, \ldots, n\} \to Q \times (\Sigma \times \{1,2\} \times \mathbb{N})^{*2} \text{ such that } r(0) \vdash r(1) \vdash \cdots \vdash r(n).$$

We write $\mathrm{ln}(r) := n$ and call $r$ a run from $r(0)$ to $r(n)$. We say $r$ visits a stack $s$ at $i$ if
$r(i) = (q, s)$.

For runs $r, r'$ of length $n$ and $m$, respectively, with $r(n) = r'(0)$, we define the
composition $r \circ r'$ of $r$ and $r'$ in the obvious manner.

Remark 3.5. Note that we do not require runs to start in the initial configuration. 



4. Encoding of Collapsible Pushdown Graphs in Trees 

In this section we prove that CPG are tree-automatic. For this purpose we have to 
encode stacks in trees. The idea is to divide a stack into blocks and to encode different 
blocks in different subtrees. The crucial observation is that every stack is a list of words 
that share the same first letter. A block is a maximal list of words in the stack that share 
the same first two letters². If we remove the first letter of every word of such a block, the
resulting 2-word decomposes again as a list of blocks. Thus, we can inductively carry on 
to decompose parts of a stack into blocks and code every block in a different subtree. The 
roots of these subtrees are labelled with the first letter of the corresponding block. This 
results in a tree in which every initial left-closed path represents one word of the stack. By 
left-closed, we mean that the last element of the path has no left successor. 

It turns out that - via this encoding - each stack operation corresponds to a simple 
MSO-definable tree-operation. The main difficulty is to provide a tree-automaton that 
checks whether there is a run to the configuration represented by some tree. This problem 
is addressed in Section 5.



² See Figure 2 for an example of blocks and Definition 4.1 for their formal definition.

Figure 2: Example of blocks in a stack. These form a c-blockline.



As already mentioned, the encoding works by dividing stacks into blocks. The following
definition makes our notion of blocks precise. For $w \in \Sigma^*$ and $s = w_1 : w_2 : \cdots : w_n \in \Sigma^{*2}$,
we write $s' := w \setminus s$ for $s' = [w w_1] : [w w_2] : \cdots : [w w_n]$.



Definition 4.1 ($\sigma$-block(line)). For $\sigma \in \Sigma$, we call $b \in \Sigma^{*2}$ a $\sigma$-block if $b = [\sigma]$ or $b = \sigma\tau \setminus s'$
for some $\tau \in \Sigma$ and $s' \in \Sigma^{*2}$. See Figure 2 for examples of blocks. If $b_1, b_2, \ldots, b_n$ are
$\sigma$-blocks, then we call $b_1 : b_2 : \cdots : b_n$ a $\sigma$-blockline.

Note that every stack in $\mathrm{Stck}(\Sigma)$ forms a $(\bot, 1, 0)$-blockline. Furthermore, every
blockline $l$ decomposes uniquely as $l = b_1 : b_2 : \cdots : b_n$ of maximal blocks $b_i$ in $l$. Another crucial
observation is that a $\sigma$-block $b \in \Sigma^{*2} \setminus \Sigma$ decomposes as $b = \sigma \setminus l$ for some blockline $l$ and
we say $l$ is the induced blockline of $b$. For $b \in \Sigma$ the induced blockline of $[b]$ is just the
empty 2-word.
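The unique decomposition into maximal blocks can be computed by one left-to-right pass. The sketch below assumes that a blockline is given as a tuple of words sharing their first letter, with words written as Python strings of single-character letters for readability; the names `maximal_blocks` and `induced_blockline` are hypothetical.

```python
def maximal_blocks(blockline):
    """Split a blockline (tuple of words with a common first letter) into
    maximal blocks: maximal runs of words that agree on their second letter.
    A word of length 1 always forms a block on its own."""
    blocks = []
    for w in blockline:
        last = blocks[-1][-1] if blocks else None
        if last is not None and len(w) >= 2 and len(last) >= 2 and last[1] == w[1]:
            blocks[-1].append(w)
        else:
            blocks.append([w])
    return [tuple(b) for b in blocks]

def induced_blockline(block):
    """Strip the common first letter; a one-letter block induces the empty 2-word."""
    return tuple(w[1:] for w in block if len(w) >= 2)

blocks = maximal_blocks(("abc", "abd", "ac", "a", "ad"))
```

For the sample input the first two words form one block because they share the first two letters "ab", while every other word starts a block of its own.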

Now we encode a $(\sigma, n, m)$-blockline $l$ in a tree by labelling the root with $(\sigma, n)$, by
encoding the blockline induced by the first block of $l$ in the left subtree, and by encoding
the rest of the blockline in the right subtree. In order to avoid repetitions, we do not repeat
the symbol $(\sigma, n)$ in the right subtree, but replace it by the default letter $\varepsilon$.

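Definition 4.2. Let $s = w_1 : w_2 : \cdots : w_n \in (\Sigma \times \{1,2\} \times \mathbb{N})^{+2}$ be a $(\sigma, l, k)$-blockline. Let
$w'_i$ be words such that $s = (\sigma, l, k) \setminus [w'_1 : w'_2 : \cdots : w'_n]$ and set $s' := w'_1 : w'_2 : \cdots : w'_n$. As
an abbreviation we write ${}_h s_i := w_h : w_{h+1} : \cdots : w_i$ (and analogously ${}_h s'_i$). Furthermore, let $w_1 : w_2 : \cdots : w_j$ be a
maximal block of $s$. Note that $j > 1$ implies $w_{j'} = (\sigma, l, k)(\sigma', l', k') w''_{j'}$ for all $j' \le j$, some
fixed $(\sigma', l', k') \in \Sigma \times \{1,2\} \times \mathbb{N}$, and appropriate $w''_{j'} \in \Sigma^*$. For $\rho \in (\Sigma \times \{1,2\}) \cup \{\varepsilon\}$, we
define recursively the $(\Sigma \times \{1,2\}) \cup \{\varepsilon\}$-labelled tree $\mathrm{Enc}(s, \rho)$ via

$$\mathrm{Enc}(s, \rho) := \begin{cases} \rho & \text{if } |w_1| = 1,\ n = 1, \\ \rho(\emptyset, \mathrm{Enc}({}_2 s_n, \varepsilon)) & \text{if } |w_1| = 1,\ n > 1, \\ \rho(\mathrm{Enc}({}_1 s'_n, (\sigma', l')), \emptyset) & \text{if } j = n,\ |w_1| > 1, \\ \rho(\mathrm{Enc}({}_1 s'_j, (\sigma', l')), \mathrm{Enc}({}_{j+1} s_n, \varepsilon)) & \text{otherwise.} \end{cases}$$

$\mathrm{Enc}(s) := \mathrm{Enc}(s, (\bot, 1))$ is called the (tree-)encoding of the stack $s \in \mathrm{Stck}(\Sigma)$.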

Figure 3 shows a configuration and its encoding.
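The recursive definition of the encoding translates almost literally into code. The sketch below is one possible reading of Definition 4.2, under the assumptions that a blockline is a tuple of words of `(symbol, level, link)` triples and that the resulting tree is a dict from node addresses (strings over "0"/"1") to labels. The five-word stack `s` is an example chosen for illustration, and the names `enc` and `enc_stack` are hypothetical.

```python
EPS = "ε"  # default letter used for the roots of right subtrees

def enc(s, rho, addr="", tree=None):
    """Enc(s, rho): encode blockline s below address addr, root label rho."""
    tree = {} if tree is None else tree
    tree[addr] = rho
    j = 1  # number of words in the first maximal block
    if len(s[0]) >= 2:
        while j < len(s) and len(s[j]) >= 2 and s[j][1] == s[0][1]:
            j += 1
        sym, level, _ = s[0][1]
        # left subtree: blockline induced by the first block
        enc(tuple(w[1:] for w in s[:j]), (sym, level), addr + "0", tree)
    if j < len(s):
        # right subtree: the remaining blocks, rooted at an EPS node
        enc(s[j:], EPS, addr + "1", tree)
    return tree

def enc_stack(s):
    sym, level, _ = s[0][0]
    return enc(s, (sym, level))

B = ("⊥", 1, 0)
a0, a2, b = ("a", 2, 0), ("a", 2, 2), ("b", 2, 0)
s = ((B, a0, b),
     (B, a0, b, ("c", 2, 1)),
     (B, a2, ("c", 1, 2)),
     (B, a2, ("d", 2, 3), ("e", 1, 3)),
     (B, a2))
tree = enc_stack(s)
```

The first two words form the first block below the root, so they end up in the left subtree, while the remaining three words are encoded below the right child of the root.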

Remark 4.3. In this encoding, the first block of a $(\sigma, l, k)$-blockline is encoded in a subtree
whose root $d$ is labelled $(\sigma, l)$. We can restore $k$ from the position of $d$ in the tree $\mathrm{Enc}(s)$ as
follows. If $l = 1$ then $k = |d|_0$, i.e., the number of occurrences of 0 in $d$. This is due to the
fact that level 1 links always point to the preceding letter and that we always introduce a
left-successor tree in order to encode letters that are higher in the stack.

The case $l = 2$ needs some closer inspection. Assume that some $d \in T := \mathrm{Enc}(s)$
is labelled $(\sigma, 2)$. Then it encodes a letter $(\sigma, 2, k)$ and this is not a cloned element.
Figure 3: A stack $s$ and its encoding $\mathrm{Enc}(s)$: right arrows lead to 1-successors (right
successors), upward arrows lead to 0-successors (left successors).



Thus, $k$ equals the number of words to the left of this letter $(\sigma, 2, k)$. We claim that
$k = |\{e \in T \cap \{0,1\}^*1 : e \le_{lex} d\}|$. The existence of a pair $e, e1 \in T$ corresponds to the
fact that there is some blockline consisting of blocks $b_1 : b_2 : \cdots : b_n$ with $n \ge 2$ such that $b_1$
is encoded in $T_e \setminus T_{e1}$ and $b_2 : \cdots : b_n$ is encoded in $T_{e1}$. By induction, one easily sees that
for each such pair $e, e1 \in T$ all the letters that are in words left of the letter encoded by
$e1$ are encoded in lexicographically smaller elements. Furthermore, the size of $((0^*)1)^* \cap T$
corresponds to the number of words in $s$ since the introduction of a 1-successor corresponds
to the separation of the first block of some blockline from the other blocks. Each of these
separations can also be seen as the separation of the last word of the first block from the first
word of the second block of this blockline. Note that we separate two words that are next
to each other in exactly one blockline. Putting these facts together our claim is proved.
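The two rules of the remark for restoring collapse links can be checked on a concrete tree. The sketch assumes a stack encoding stored as a dict from node addresses to labels; the tree `T` below is an illustrative example (the encoding of a five-word stack), and `restore_link` is a hypothetical helper. Python's built-in string comparison on addresses coincides with the lexicographic order used here, since a proper prefix compares as smaller.

```python
EPS = "ε"
# Illustrative encoding of a stack with five words (assumed example).
T = {
    "": ("⊥", 1), "0": ("a", 2), "00": ("b", 2), "001": EPS, "0010": ("c", 2),
    "1": EPS, "10": ("a", 2), "100": ("c", 1), "101": EPS,
    "1010": ("d", 2), "10100": ("e", 1), "1011": EPS,
}

def restore_link(tree, d):
    """Recover the collapse link of the letter encoded at a non-EPS node d."""
    _, level = tree[d]
    if level == 1:
        return d.count("0")  # number of occurrences of 0 in d
    # level 2: count nodes ending in 1 that are lexicographically <= d
    return sum(1 for e in tree if e.endswith("1") and e <= d)

links = {d: restore_link(T, d) for d, lab in T.items() if lab != EPS}
```

For instance, the node "10" carries the label (a, 2) and two 1-successors ("001" and "1") precede it, so the restored letter is (a, 2, 2).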

Another view on this correspondence is the bijection $f : \{1, 2, \ldots, |s|\} \to R$ where
$R := ((0^*)1)^* \cap \mathrm{dom}(T)$ and $i$ is mapped to the $i$-th element of $R$ in lexicographic order.
$f(i)$ is exactly the position where the $(i-1)$-st word is separated from the $i$-th one for all
$i \ge 2$. In order to state the properties of $f$, we need some more notation. We write $\pi$ for
the canonical projection $\pi : (\Sigma \times \{1,2\} \times \mathbb{N})^* \to (\Sigma \times \{1,2\})^*$ and $w_i$ for the $i$-th word of $s$.
Furthermore, let $w'_i$ be a word such that $w_i = (w_i \sqcap w_{i-1}) \circ w'_i$ (here we set $w_0 := \varepsilon$). Then
the word along the path³ from the root to $f(i)$ is exactly $\pi(w_i \sqcap w_{i-1})$ for all $2 \le i \le |s|$
and the path from $f(j)$ to $f(j) \circ 0^m$ for maximal $m \in \mathbb{N}$ is $\pi(w'_j)$ for all $1 \le j \le |s|$.
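The correspondence between the set $R$ and the words of the stack suggests a simple way to read the projected words back from a tree. The following sketch assumes the dict representation of trees (address to label) and uses an illustrative example tree `T`; the function name `words` is hypothetical, and collapse links are not restored here.

```python
EPS = "ε"
# Illustrative encoding of a stack with five words (assumed example).
T = {
    "": ("⊥", 1), "0": ("a", 2), "00": ("b", 2), "001": EPS, "0010": ("c", 2),
    "1": EPS, "10": ("a", 2), "100": ("c", 1), "101": EPS,
    "1010": ("d", 2), "10100": ("e", 1), "1011": EPS,
}

def words(tree):
    """Read off the words of the stack (symbols and levels only)."""
    # R: the root and all nodes ending in 1, in lexicographic order
    R = sorted(d for d in tree if d == "" or d.endswith("1"))
    result = []
    for r in R:
        # non-EPS labels on the path from the root to r ...
        word = [tree[r[:i]] for i in range(len(r) + 1) if tree[r[:i]] != EPS]
        # ... followed by the labels on the maximal chain of 0-successors of r
        d = r + "0"
        while d in tree:
            word.append(tree[d])
            d += "0"
        result.append(word)
    return result

ws = words(T)
```

The number of words equals the size of `R`, five in this example, and each word is the shared prefix read on the way to its element of `R` followed by the letters on the chain of left successors.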

In order to encode a configuration $c := (q, s)$, we add $q$ as a new root of the tree and
attach the encoding of $s$ as the left subtree, i.e., $\mathrm{Enc}(c) := q(\mathrm{Enc}(s), \emptyset)$.

The image of this encoding function contains only trees of a very specific type. We call
this class $T_{Enc}$. In the next definition we state the characterising properties of $T_{Enc}$. This
class is MSO definable, whence automata-recognisable.

Definition 4.4. Let $T_{Enc}$ be the class of all trees $T$ that satisfy the following conditions.

(1) The root of $T$ is labelled by some element of $Q$ ($T(\varepsilon) \in Q$).

(2) Every element of the form $\{0,1\}^*0$ is labelled by some $(\sigma, l) \in \Sigma \times \{1,2\}$; especially,
$T(0) = (\bot, 1)$ and there are no other occurrences of $(\bot, 1)$ or $(\bot, 2)$.

(3) Every element of the form $\{0,1\}^*1$ is labelled by $\varepsilon$.

(4) $1 \notin \mathrm{dom}(T)$, $0 \in \mathrm{dom}(T)$.

(5) For all $t \in T$, if $T(t0) = (\sigma, 1)$ then $T(t10) \ne (\sigma, 1)$.
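The five conditions are local and can be checked node by node. The sketch below does this for a tree given as a dict from addresses to labels and assumes that the input is already a prefix-closed tree; the function `in_t_enc` and its parameters `states` and `alphabet` are hypothetical names.

```python
EPS = "ε"
BOT = "⊥"

def in_t_enc(tree, states, alphabet):
    """Check conditions (1)-(5) on a tree given as dict address -> label."""
    # (1) root labelled by a state; (4) node 0 exists, node 1 does not
    if tree.get("") not in states or "1" in tree or "0" not in tree:
        return False
    for d, label in tree.items():
        if d.endswith("1") and label != EPS:  # (3)
            return False
        if d.endswith("0"):
            # (2) label is a pair (symbol, level)
            if not (isinstance(label, tuple) and len(label) == 2
                    and label[0] in alphabet and label[1] in (1, 2)):
                return False
            # (2) the bottom symbol occurs exactly at node 0, with level 1
            if (d == "0") != (label[0] == BOT) or (d == "0" and label[1] != 1):
                return False
            # (5) no repeated level 1 label at t0 and t10
            if label[1] == 1 and tree.get(d[:-1] + "10") == label:
                return False
    return True

good = {"": 0, "0": (BOT, 1), "00": ("a", 2), "01": EPS, "010": ("a", 2)}
bad = {"": 0, "0": (BOT, 1), "00": ("a", 1), "01": EPS, "010": ("a", 1)}
```

The tree `bad` differs from `good` only in the collapse level of the two a-labelled nodes and is rejected by condition (5).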



³ By the word along a path from one node to another we mean the word consisting of the non-$\varepsilon$ labels
along this path.

Figure 4: $\mathrm{pop}_2$-operation.

Figure 5: collapse-operation of level 2.



Remark 4.5. Note that (5) holds as $T(t0) = T(t10) = (\sigma, 1)$ would imply that the subtree
rooted at $t$ encodes a blockline $l$ such that the first block of $l$ induces a $(\sigma, 1, n)$-blockline
and the second one induces a $(\sigma, 1, m)$-blockline. But as level 1 links always point to the
preceding letter, $n$ and $m$ are equal to the length of the prefix of $l$ in the stack plus 1, i.e.,
if $T$ encodes a stack $s$ then $s = s_1 : [w \setminus l] : s_2$ and $n = m = |w| + 1$. This would contradict
the maximality of the blocks in the encoding.

Remark 4.6. $\mathrm{Enc} : Q \times \mathrm{Stck}(\Sigma) \to T_{Enc}$ is a bijection and we denote its inverse by $\mathrm{Dec}$.

Our encoding turns the transitions of a CPG into regular tree-operations. The
tree-operations corresponding to $\mathrm{pop}_2$ and collapse can be seen in Figures 4 and 5. For the $\mathrm{pop}_2$,
note that if $v_1$ is the 0-successor of $v_0$ then $v_0$ and $v_1$ encode symbols in the same word of
the encoded stack. As a $\mathrm{pop}_2$ removes the rightmost word, we have to remove all the nodes
encoding information about this word. As the rightmost leaf corresponds to the topmost
symbol of the stack, we have to remove this leaf and all its 0-ancestors.
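The tree operation for $\mathrm{pop}_2$ can be sketched and compared against encoding the popped stack directly. The code below assumes the dict representation of trees, reads "0-ancestors" as the chain of nodes from which the rightmost leaf is reached by 0-edges only, and uses an illustrative five-word example; `enc`, `enc_stack` and `tree_pop2` are hypothetical names.

```python
EPS = "ε"

def enc(s, rho, addr="", tree=None):
    """Tree encoding of a blockline s (one reading of Definition 4.2)."""
    tree = {} if tree is None else tree
    tree[addr] = rho
    j = 1
    if len(s[0]) >= 2:
        while j < len(s) and len(s[j]) >= 2 and s[j][1] == s[0][1]:
            j += 1
        enc(tuple(w[1:] for w in s[:j]), s[0][1][:2], addr + "0", tree)
    if j < len(s):
        enc(s[j:], EPS, addr + "1", tree)
    return tree

def enc_stack(s):
    return enc(s, s[0][0][:2])

def tree_pop2(tree):
    """Remove the rightmost leaf and the chain of nodes it is reached from
    by 0-edges only; None if the encoded stack has a single word."""
    leaf = max(tree)         # lexicographically largest node = rightmost leaf
    top = leaf.rstrip("0")   # last 1-successor on the path to the leaf
    if top == "":
        return None
    chain = {leaf[:i] for i in range(len(top), len(leaf) + 1)}
    return {d: lab for d, lab in tree.items() if d not in chain}

B = ("⊥", 1, 0)
a0, a2, b = ("a", 2, 0), ("a", 2, 2), ("b", 2, 0)
s = ((B, a0, b), (B, a0, b, ("c", 2, 1)), (B, a2, ("c", 1, 2)),
     (B, a2, ("d", 2, 3), ("e", 1, 3)), (B, a2))
```

On this example, applying `tree_pop2` to the encoding of the first n words yields the encoding of the first n-1 words for every n from 5 down to 2.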

For the collapse (on level 2), we note that each $\varepsilon$ represents a cloned element. The
collapse induced by such an element produces the same stack as a $\mathrm{pop}_2$ of its original
version. The original symbol of the rightmost leaf is its first ancestor not labelled by $\varepsilon$.

Note that the operations corresponding to $\mathrm{pop}_2$ and collapse are clearly MSO definable.
All other transitions in CPG correspond to MSO definable tree-operations, too. Due to
space restrictions we skip the details.

Lemma 4.7. Let $C$ be the set of encodings of configurations of a CPS $S$. Then there are
automata $A_{(q,op)}$ for all $q \in Q$ and all $op \in OP$ such that for all $c_1, c_2 \in C$

$A_{(q,op)}$ accepts $\mathrm{Enc}(c_1) \otimes \mathrm{Enc}(c_2)$ iff $c_1 \vdash^{(q,op)} c_2$.



5. Recognising Reachable Configurations

We show that $\mathrm{Enc}$ maps the reachable configurations of a given CPS to a regular set.
For this purpose we introduce milestones of a stack $s$. It turns out that these are exactly
those substacks of $s$ that every run to $s$ has to visit. Furthermore, the milestones of $s$ are
represented by the nodes of $\mathrm{Enc}(s)$: with every $d \in \mathrm{Enc}(s)$, we can associate a subtree
of $\mathrm{Enc}(s)$ which encodes a milestone. Furthermore, the substack relation on the milestones
corresponds exactly to the lexicographical order $\le_{lex}$ of the elements of $\mathrm{Enc}(s)$. For every
$d \in \mathrm{Enc}(s)$ we can guess the state in which the corresponding milestone is visited for the
last time by some run to $s$ and we can check the correctness of this guess using MSO or,
equivalently, tree-automata.

We prove that we can check the correctness of such a guess by introducing a special 
type of run, called loop, which is basically a run that starts and ends with the same stack. 
A run from one milestone to the next will mainly consist of loops combined with a finite 
number of stack operations. 

5.1. Milestones 

Definition 5.1 (Milestone). A substack $s'$ of $s = w_1 : w_2 : \cdots : w_n$ is a milestone if
$s' = w_1 : w_2 : \cdots : w_i : w'$ such that $0 \le i < n$ and $w_i \sqcap w_{i+1} \le w' \le w_{i+1}$. We denote by
$\mathrm{MS}(s)$ the set of milestones of $s$.

Note that the substack relation $\le$ linearly orders $\mathrm{MS}(s)$.
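The milestones of a concrete stack can be listed directly from the definition. The sketch below assumes stacks are tuples of words of `(symbol, level, link)` triples, reads $\le$ on words as the prefix order, sets $w_0 := \varepsilon$, and additionally assumes that $w'$ must be non-empty so that every milestone is a stack. The five-word stack is an example chosen for illustration and `milestones` is a hypothetical name.

```python
def common_prefix_len(u, v):
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return n

def milestones(s):
    """All milestones of s, listed in increasing substack order."""
    result, prev = [], ()
    for i, w in enumerate(s):
        # w' ranges over prefixes of w_{i+1} extending the common prefix
        # with w_i; at least one letter so that the result is a stack
        start = max(common_prefix_len(prev, w), 1)
        for m in range(start, len(w) + 1):
            result.append(s[:i] + (w[:m],))
        prev = w
    return result

B = ("⊥", 1, 0)
a0, a2, b = ("a", 2, 0), ("a", 2, 2), ("b", 2, 0)
s = ((B, a0, b), (B, a0, b, ("c", 2, 1)), (B, a2, ("c", 1, 2)),
     (B, a2, ("d", 2, 3), ("e", 1, 3)), (B, a2))
ms = milestones(s)
```

For this stack the list has twelve entries, starting with the initial stack and ending with `s` itself.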

Lemma 5.2. If $s, t, m$ are stacks with $m \in \mathrm{MS}(t)$ but $m \not\le s$, then every run from $s$ to $t$
visits $m$. Thus, for every run $r$ from the initial configuration to $s$, the function

$$f : \mathrm{MS}(s) \to \mathrm{dom}(r), \quad s' \mapsto \max\{i \in \mathrm{dom}(r) : r(i) = (q, s') \text{ for some } q \in Q\}$$

is an order embedding with respect to the substack relation on the milestones and the natural
order of $\mathrm{dom}(r)$.

In order to state the close correspondence between milestones of a stack s and the 
elements of Enc(s), we need the following definition. 

Definition 5.3. Let $T \in T_{Enc}$ be a tree and $d \in T \setminus \{\varepsilon\}$. Then the left and downward
closed tree induced by $d$ is $\mathrm{LT}(d, T) := T{\restriction}D$ where $D := \{d' \in T : d' \le_{lex} d\} \setminus \{\varepsilon\}$. Then we
denote by $\mathrm{LStck}(d, T) := \mathrm{Dec}(\mathrm{LT}(d, T))$ the left stack induced by $d$.

Remark 5.4. $\mathrm{LStck}(d, s)$ is a substack of $s$ for all $d \in \mathrm{dom}(\mathrm{Enc}(s))$. This observation
follows from Remark 4.3 combined with the fact that the left stack is induced by a
lexicographically downward closed subset. In fact, $\mathrm{LStck}(d, s)$ is a milestone of $s$.

Lemma 5.5. The map given by $g : d \mapsto \mathrm{LStck}(d, \mathrm{Enc}(s))$ is an order isomorphism between
$(\mathrm{dom}(\mathrm{Enc}(q, s)) \setminus \{\varepsilon\}, \le_{lex})$ and $(\mathrm{MS}(s), \le)$.

Lemmas 5.5 and 5.2 imply that every run $r$ decomposes as $r = r_1 \circ r_2 \circ \cdots \circ r_n$ where
$r_i$ is a run from the $i$-th milestone of $r(\mathrm{ln}(r))$ to the $(i+1)$-st milestone.

In order to describe the structure of the $r_i$, we have to introduce the notion of a loop.
Informally speaking, a loop is a run $r$ that starts and ends with the same stack $s$ and which
does not look too much into $s$.

Definition 5.6. Let $r$ be a run of length $n$ with $r(i) = (q_i, s_i)$ for all $0 \le i \le n$.

• $r$ is called a simple high loop if $s_0 = s_n$ and if $s_0 < s_i$ for all $0 < i < n$.

• $r$ is called a simple low loop of $s$ if $s_0 = s_n = s$, between 0 and $n$ the stack $s$ is never
visited, $s_1 = \mathrm{pop}_1(s)$, $\mathrm{CLvl}(s) = 1$, $|s_i| \ge |s|$ for all $0 \le i \le n$, and $r{\restriction}_{[1, n-1]}$ is the
composition of simple low loops and simple high loops of $\mathrm{pop}_1(s)$.

• $r$ is called loop if it is a finite composition of low loops and high loops.

Lemma 5.7. Let $s$ be some stack, $m_1, m_2$ milestones of $s$, and $r$ a run from $m_1$ to $m_2$ that
never visits any other milestone of $s$. Then either $r = l_1 \circ p \circ l_2$ or $r = l_0 \circ c \circ l_1 \circ p_1 \circ l_2 \circ
p_2 \circ l_3 \circ \cdots \circ p_n \circ l_{n+1}$ where each $l_i$ is a loop, and all $p_i$, $p$, and $c$ are runs of length 1, $p$
performs one $\mathrm{push}_{\sigma,k}$, $c$ performs one $\mathrm{clone}_2$, and the $p_i$ perform one $\mathrm{pop}_1$ each.

This lemma motivates why we only define low loops for stacks $s$ with $\mathrm{CLvl}(s) = 1$.
Whenever the topmost symbol of a milestone $m$ is not a cloned element, then $\mathrm{pop}_1(m)$ is
another milestone. Hence, the $l_i$ can only contain low loops if they start at a stack with
cloned topmost symbol. But any stack $s$ with cloned topmost symbol and $\mathrm{CLvl}(s) = 2$
cannot be restored from $\mathrm{pop}_1(s)$ without passing $\mathrm{pop}_2(s)$ since a $\mathrm{push}_{\sigma,2}$-operation would
create the wrong link-level.

From Lemma 5.7 we can derive that deciding whether there is a run from one milestone
to the next is possible if we know the pairs of initial and final states of loops of certain
stacks $s$. Hence we are interested in the sets $\mathrm{Loops}(s) \subseteq Q \times Q$ with $(q_1, q_2) \in \mathrm{Loops}(s)$ if
and only if there is a loop from $(q_1, s)$ to $(q_2, s)$. The crucial observation is that $\mathrm{Loops}(s)$
may be calculated by a finite automaton reading $\mathrm{top}_2(s)$.

Lemma 5.8. For every CPS there exists a finite automaton $\mathcal{A}$ that calculates⁴ on
input $w \in (\Sigma \times \{1,2\})^*$ the set $\mathrm{Loops}(s)$ for all stacks $s$ such that $w = \pi(\mathrm{top}_2(s))$. Here,
$\pi : (\Sigma \times \{1,2\} \times \mathbb{N})^* \to (\Sigma \times \{1,2\})^*$ is the projection onto the symbols and collapse-levels.

5.2. Detection of Reachable Configurations 

We have already seen that every run to a valid configuration $(q, s)$ passes all the
milestones of $s$. Now, we use the last state in which a run $r$ to $(q, s)$ visits each milestone as
a certificate for the reachability of $(q, s)$. To be precise, a certificate for the reachability of
$(q, s)$ is a map $f : \mathrm{dom}(\mathrm{Enc}(q, s)) \setminus \{\varepsilon\} \to Q$ such that there is some run $r$ from $\bot_2$ to
$(q, s)$ and $f(d) = q'$ if and only if $r(i) = (q', \mathrm{LStck}(d))$ for $i$ the maximal position in $r$ where
$\mathrm{LStck}(d)$ is visited.

Lemma 5.9. For every CPG $G$, there is a tree-automaton that checks for each map

$$f : \mathrm{dom}(\mathrm{Enc}(q, s)) \setminus \{\varepsilon\} \to Q$$

whether $f$ is a certificate of the reachability of $(q, s)$, i.e., whether $f$ is induced by some run
$r$ from the initial configuration to $(q, s)$.

The proof of the lemma uses Lemma 5.8 and the fact that the path from the root to
some $d \in \mathrm{Enc}(s)$ encodes the topmost word of $\mathrm{LStck}(d, \mathrm{Enc}(s))$. Hence, a tree automaton
reading $\mathrm{Enc}(s)$ is able to calculate for each position $d \in \mathrm{Enc}(s)$ the pairs of initial and final
states of loops of $\mathrm{LStck}(d)$. As every run decomposes as a sequence of loops separated by
a single operation, knowing $\mathrm{Loops}(s')$ for each $s' \le s$ enables the automaton to check the
correctness of a candidate for a certificate of reachability.



⁴ We consider the final state reached by $\mathcal{A}$ on input $w$ as the value it calculates for $w$.

As a tree-automaton may non-deterministically guess a certificate of the reachability of
a configuration, the encodings of reachable configurations form a regular set.

5.3. Extension to Regular Reachability 

By now, we have already established the tree-automaticity of each CPG G since we 
have seen that our encoding yields a regular image of the vertices of G and the transition 
relations are turned into regular relations of the tree encoding. Using similar techniques, 
we can improve this result: 

Theorem 5.10. If $G$ is the $\varepsilon$-closure of some CPG $G'$ then $(G, \mathrm{Reach})$ is tree-automatic
where $\mathrm{Reach}$ is the binary predicate that is true on a pair $(c_1, c_2)$ of configurations if there
is a path from $c_1$ to $c_2$ in $G$.

Remark 5.11. Each graph in the second level of the Caucal-hierarchy can be obtained as
the $\varepsilon$-contraction of some level 2 CPG (see [3]) whence all these graphs are tree-automatic.

For a CPS $S$ let $R \subseteq \Delta^*$ be a regular language over the transitions of $S$. As collapsible
pushdown graphs are closed under products with finite automata even the reachability
predicate $\mathrm{Reach}_R$ with restriction to $R$ is tree-automatic. Here, $\mathrm{Reach}_R xy$ holds if there is a path
from $x$ to $y$ in $\mathrm{CPG}(S)$ that uses a sequence of transitions in $R$. If $\mathcal{A}$ is the automaton
recognising $R$, we obtain that $\mathrm{Reach}_R (q, s)(q', s')$ holds in $\mathrm{CPG}(S)$ iff $\mathrm{Reach}((q, q_i), s)((q', q_f), s')$
holds in $\mathrm{CPG}(S \times \mathcal{A})$ where $q_i$ is the initial and $q_f$ the unique final state of $\mathcal{A}$. Using this
idea one can define a CPG $G'$ which is basically $\mathrm{CPG}(S \cup (S \times \mathcal{A}))$ extended by transitions
from $(q, s)$ to $((q, q_i), s)$ and to $((q, q_f), s)$. $\mathrm{CPG}(S)$ as well as $\mathrm{Reach}_R$ w.r.t. $\mathrm{CPG}(S)$ are
FO[Reach]-interpretable in $G'$. Hence we obtain:

Theorem 5.12. Given a collapsible pushdown graph of level 2, its FO[$\mathrm{Reach}_R$] theory is
decidable for each regular $R \subseteq \Delta^*$.

5.4. Computation of concrete tree-automatic representations of CPG 

Up to now, we have only seen that there is a tree-automatic representation for each 
CPG. For computing a concrete representation, we rely on the following lemma. 

Lemma 5.13. Given some CPS $S = (\Sigma, Q, \Delta, q_0)$, some $q \in Q$, and some stack $s$, it is
decidable whether $(q, s)$ is a vertex of $\mathrm{CPG}(S)$.

The proof is based on the idea that a stack is uniquely determined by its top element and
the information which substacks can be reached via collapse- and $\mathrm{pop}_i$-operations. Hence we
can construct an extension $S'$ of $S$ and a modal formula $\varphi_{q,s}$ such that there is some element
$v \in \mathrm{CPG}(S')$ satisfying $\mathrm{CPG}(S'), v \models \varphi_{q,s}$ iff $(q, s) \in \mathrm{CPG}(S)$. $S'$ basically contains new
states for every substack of $s$ and connects the different states via the appropriate
$\mathrm{pop}_i$-operations which are only applied if the topmost symbol of the stack agrees with the symbol
we would expect when starting the $\mathrm{pop}_i$-sequence in configuration $(q, s)$.

From this lemma we can derive the computability of the automata in Lemma 5.8.
Having obtained these automata, the construction of a tree-automatic representation of
some CPG is directly derived from the proofs yielding the following theorem.

Theorem 5.14. There is an algorithm that, given a level 2 CPG $G$ and regular sets
$R_1, \ldots, R_n \subseteq \Delta^*$, computes a tree-automatic representation of $(G, \mathrm{Reach}_{R_1}, \ldots, \mathrm{Reach}_{R_n})$.



6. Conclusion

We have seen that level 2 collapsible pushdown graphs are tree-automatic. This result 
holds also if we apply $\varepsilon$-contractions and if we add regular reachability predicates. This
implies that the second level of the Caucal-hierarchy is tree-automatic. But our result can 
only be seen as a starting point for further investigations of the CPG hierarchy: are level 3 
collapsible pushdown graphs tree-automatic? We know an example of a level 5 CPG which 
is not tree-automatic. But even when tree-automaticity of all CPG cannot be expected, 
the question remains whether all CPG have decidable FO theories. In order to solve this 
problem one has to come up with new techniques. 

A rather general question concerning our result aims at our knowledge about tree- 
automatic structures. Recent developments in the string case [9] show the decidability of 
rather large extensions of first-order logic for automatic structures. It would be interesting 
to clarify the status of the analogous claims for tree-automatic structures. Positive answers 
concerning the decidability of extensions of first-order logic on tree-automatic structures 
would give us the corresponding decidability results for collapsible pushdown graphs of 
level 2. 